Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ompi/request: Support non-PML persistent requests #3701

Merged
merged 1 commit into from
Jul 31, 2017

Conversation

kawashima-fj
Copy link
Member

This commit adds the req_start member to the ompi_request_t struct. The MPI_START and MPI_STARTALL routines call this callback function instead of MCA_PML_CALL(start(...)). So components that return persistent request must set this member to their request objects.

mca_pml_base_module_t::pml_start is not deleted because MCA_PML_CALL(start(...)) is still used elsewhere across OMPI.

This commit was a part of my PR #2758, which I want to pull when the MPI Forum decide to add the persistent collective to the MPI Standard. But it will be required to implement other persistent request features. So I want to merge this before #2758.

I want to discuss this code modification at the upcoming July 2017 OMPI developer's meeting.

This commit adds the `req_start` member to the `ompi_request_t` struct.
The `MPI_START` and `MPI_STARTALL` routines call this callback function
instead of `MCA_PML_CALL(start(...))`. So components that return
persistent request must set this member to their request objects.

`mca_pml_base_module_t::pml_start` is not deleted because
`MCA_PML_CALL(start(...))` is still used elsewhere across OMPI.

Signed-off-by: KAWASHIMA Takahiro <[email protected]>
@bosilca
Copy link
Member

bosilca commented Jun 15, 2017

I certainly understand the benefit of moving the request_start into the base request. However, I don't see the point of having the prototype of this function with a count and an array of request, as in general we are not able to alter the order of requests in the user array. Do we really expect to find ranges of similar requests and start them together ?

@kawashima-fj
Copy link
Member Author

@bosilca Certainly the prototype of ompi_request_start_fn_t looks a little odd. But the point is giving a chance to optimize communication scheduling under the rule of the MPI Standard. There is a study on this.
Masayuki Hatanaka, Atsuhi Hori, and Yutaka Ishikawa, "Optimization of MPI Persistent Communication," Submitted to EuroMPI 2013, 2013.

@bosilca
Copy link
Member

bosilca commented Jun 16, 2017

I have little doubt about the interest of having persistent requests. I was asking about the need to have the start function associated with a particular request having the count and array of requests arguments, instead of simply using a request.

@kawashima-fj
Copy link
Member Author

Let's assume:

  1. Multiple components (ob1 PML, my COLL, ...) provide persistent requests through MPI_*_INIT.
  2. Each component has a logic to start its multiple requests and may optimize the communication scheduling of those request.
  3. Requests passed to MPI_STARTALL may consist of requests from multiple components.

In this situation, MPI_STARTALL must identify the owner component of each request and call startall functions of the corresponding components.

For optimization of 2., a function having only one request argument is not suited.

My logic was:

  1. In order to identify the owner component, the ompi_request_t structure should have a new member.
  2. In order to call the corresponding startall function, using the function pointer as the new member is simple.

Of course, there are other ways to associate a request with a startall function.

enum persistent_module_type {
    PERSISTENT_MODULE_TYPE_NULL,
    PERSISTENT_MODULE_TYPE_PML,
    PERSISTENT_MODULE_TYPE_COLL,
   ...
}

union persistent_module_t {
    void                   *ptr;
    mca_pml_base_module_t  *pml;
    mca_coll_base_module_t *coll;
    ....
}

struct ompi_request_t {
    ....
    enum persistent_module_type module_type;
    union persistent_module_t   module;
}

int MPI_Startall(...)
{
    union persistent_module_t module = { .ptr = NULL };
    for (i = 0, j = -1; i < count; ++i) {
        // Call a callback function per block of requests
        // which belong to the same module
        if (requests[i]->module.ptr != module.ptr && i != 0) {
            switch (requests[i - 1]->module_type) {
            case PERSISTENT_MODULE_TYPE_PML:
                module.pml->start(i - j, requests + j);
                break;
            case PERSISTENT_MODULE_TYPE_COLL:
                module.coll->start(i - j, requests + j);
                break;
            ...
            }
            module.ptr = requests[i]->module.ptr;
            j = i;
        }
    }
    // the last requests block
    switch (requests[i - 1]->module_type) {
    case PERSISTENT_MODULE_TYPE_PML:
        module.pml->start(i - j, requests + j);
        break;
    case PERSISTENT_MODULE_TYPE_COLL:
        module.coll->start(i - j, requests + j);
        break;
    ...
    }
}

Is there more smart solution?

@bosilca
Copy link
Member

bosilca commented Jun 16, 2017

Right, this solution rely heavily on the assumption that the user will place the persistent requests in the array for startall based on their logical generator. That's the only point I was trying to make.

@bosilca
Copy link
Member

bosilca commented Jul 13, 2017

Please update the padding for the predefined_request_t in request.h to reflect the addition of the extra function pointer.

@kawashima-fj
Copy link
Member Author

@bosilca Do we need to update the padding of ompi_predefined_request_t? ompi_predefined_request_t is 256 bytes regardless of ompi_request_t, which is at most 192 bytes or so.

Current definition in request.h:

#define PREDEFINED_REQUEST_PAD 256

struct ompi_predefined_request_t {
    struct ompi_request_t request;
    char padding[PREDEFINED_REQUEST_PAD - sizeof(ompi_request_t)];
};

@bosilca
Copy link
Member

bosilca commented Jul 20, 2017

I think we concluded during the face-to-face meeting that there is no need to change the padding.

@kawashima-fj kawashima-fj merged commit ebc4eb3 into open-mpi:master Jul 31, 2017
@kawashima-fj kawashima-fj deleted the pr/non-pml-persistent branch July 31, 2017 23:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants